4 research outputs found

    Grounding Complex Natural Language Commands for Temporal Tasks in Unseen Environments

    Grounding navigational commands to linear temporal logic (LTL) leverages its unambiguous semantics for reasoning about long-horizon tasks and verifying the satisfaction of temporal constraints. Existing approaches require training data from the specific environments and landmarks that will be referred to in natural language in order to understand commands in those environments. We propose Lang2LTL, a modular system and software package that leverages large language models (LLMs) to ground temporal navigational commands to LTL specifications in environments without prior language data. We comprehensively evaluate Lang2LTL on five well-defined generalization behaviors. Lang2LTL demonstrates the state-of-the-art ability of a single model to ground navigational commands to diverse temporal specifications in 21 city-scaled environments. Finally, we demonstrate that a physical robot using Lang2LTL can follow 52 semantically diverse navigational commands in two indoor environments.
    Comment: Conference on Robot Learning 202
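
    The sketch below is a minimal, hypothetical illustration (not the Lang2LTL implementation) of grounding a command to an LTL specification with an LLM: landmark mentions are lifted into placeholder propositions, a language model translates the lifted utterance into a formula, and the placeholders are mapped back to grounded landmarks. The landmark list, the translate_to_ltl() stub, and the returned formula are assumptions made only for illustration.

```python
# Minimal, hypothetical sketch of LLM-based grounding of a navigational command
# to an LTL formula (not the Lang2LTL implementation). The landmark list, the
# translate_to_ltl() stub, and the returned formula are illustrative assumptions.

def lift_landmarks(command, known_landmarks):
    """Replace landmark mentions with placeholder propositions (lifting)."""
    lifted, mapping = command, {}
    for i, landmark in enumerate(l for l in known_landmarks if l in command):
        prop = f"p{i}"
        lifted = lifted.replace(landmark, prop)
        mapping[prop] = landmark
    return lifted, mapping

def translate_to_ltl(lifted_command):
    """Stand-in for an LLM call that maps the lifted utterance to an LTL template."""
    # For "go to p0 then p1, and always avoid p2" a plausible output is:
    return "F (p0 & F p1) & G !p2"

command = "go to the bank then the library, and always avoid the construction site"
landmarks = ["the bank", "the library", "the construction site"]

lifted, grounding = lift_landmarks(command, landmarks)
formula = translate_to_ltl(lifted)

print(lifted)     # go to p0 then p1, and always avoid p2
print(formula)    # F (p0 & F p1) & G !p2
print(grounding)  # {'p0': 'the bank', 'p1': 'the library', 'p2': 'the construction site'}
```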

    CAPE: Corrective Actions from Precondition Errors using Large Language Models

    Extracting commonsense knowledge from a large language model (LLM) offers a path to designing intelligent robots. Existing approaches that leverage LLMs for planning cannot recover when an action fails and often resort to retrying failed actions without resolving the error's underlying cause. We propose CAPE, a novel approach that generates corrective actions to resolve precondition errors during planning. CAPE improves the quality of generated plans by leveraging few-shot reasoning from action preconditions. Our approach enables embodied agents to execute more tasks than baseline methods while ensuring semantic correctness and minimizing re-prompting. In VirtualHome, CAPE generates executable plans while improving a human-annotated plan correctness metric from 28.89% to 49.63% over SayCan. Our improvements transfer to a Boston Dynamics Spot robot initialized with a set of skills (specified in language) and associated preconditions, where CAPE improves the correctness metric of the executed task plans by 76.49% compared to SayCan. Our approach enables the robot to follow natural language commands and robustly recover from failures that baseline approaches largely cannot resolve or can address only inefficiently.
    Comment: 8 pages, 3 figures, Under Review at ICRA 202
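
    As a rough illustration of corrective re-prompting (not the CAPE codebase), the sketch below executes a plan step by step and, when an action fails with a precondition error, re-prompts a language model with that error to obtain a corrective action instead of blindly retrying. The Environment class, the query_llm() stub, and the error strings are assumptions made only for illustration.

```python
# Minimal, hypothetical sketch of precondition-error recovery via re-prompting
# (not the CAPE codebase). Environment, query_llm(), and the error format are
# illustrative assumptions.

def query_llm(prompt):
    """Stand-in for an LLM call; a real system would query an actual model."""
    return "open the fridge"  # corrective action suggested for the failed step

class Environment:
    """Toy environment that reports unmet action preconditions."""
    def __init__(self):
        self.fridge_open = False

    def execute(self, action):
        if action == "grab the milk" and not self.fridge_open:
            return False, "precondition not met: the fridge must be open"
        if action == "open the fridge":
            self.fridge_open = True
        return True, None

def execute_with_recovery(plan, env, max_reprompts=3):
    for step in plan:
        ok, error = env.execute(step)
        attempts = 0
        while not ok and attempts < max_reprompts:
            # Re-prompt with the precondition error instead of blindly retrying.
            corrective = query_llm(f"Step '{step}' failed: {error}. Suggest a corrective action.")
            env.execute(corrective)
            ok, error = env.execute(step)
            attempts += 1
        if not ok:
            return False
    return True

plan = ["walk to the kitchen", "grab the milk", "pour the milk"]
print(execute_with_recovery(plan, Environment()))  # True: the failed step succeeds after one corrective action
```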

    2nd Workshop on Human-Interactive Robot Learning (HIRL)

    With robots poised to enter our daily environments, they will not only need to work for people but also learn from them. An active area of investigation in the robotics, machine learning, and human-robot interaction communities is the design of teachable robots that can learn interactively from human input. To refer to these research efforts, we use the umbrella term Human-Interactive Robot Learning (HIRL). While algorithmic solutions for robots learning from people have been investigated in a variety of ways, HIRL, as a fairly new research area, still lacks: 1) a formal set of definitions to classify related but distinct research problems or solutions, 2) benchmark tasks, interactions, and metrics to evaluate the performance of HIRL algorithms and interactions, and 3) clear long-term research challenges to be addressed by different communities. Last year we began consolidating the definitions and vocabulary needed to enable fruitful discussions between researchers from these interdisciplinary fields, identified a preliminary list of long-, medium-, and short-term research problems for the community to tackle, and surveyed existing tools and frameworks that can be leveraged to this end. This workshop will build upon these discussions, focusing on promoting the specification and design of HIRL benchmarks.